Multilingual number transcription for text-to-speech conversion
نویسندگان
چکیده
This paper describes the text normalization module of a text to speech fully-trainable conversion system and its application to number transcription. The main target is to generate a language independent text normalization module, based on data instead of on expert rules. This paper proposes a general architecture based on statistical machine translation techniques. This proposal is composed of three main modules: a tokenizer for splitting the text input into a token graph, a phrase-based translation module for token translation, and a post-processing module for removing some tokens. This architecture has been evaluated for number transcription in several languages: English, Spanish and Romanian. Number transcription is an important aspect in the text normalization problem.
منابع مشابه
GlobalPhone: A Multilingual Text & Speech Database in 20 Languages
This paper describes the advances in the multilingual text and speech database GlobalPhone, a multilingual database of highquality read speech with corresponding transcriptions and pronunciation dictionaries in 20 languages. GlobalPhone was designed to be uniform across languages with respect to the amount of data, speech quality, the collection scenario, the transcription and phone set convent...
متن کاملA Qualitative Evaluation of Phoneme-to-Phoneme Technology
Automatic speech recognition systems apply grapheme-to phoneme transcription (G2P) to model pronunciation of items in the lexicon. General purpose G2P transcriptions are not always accurate, e.g., in a multilingual environment. To improve the transcription quality, G2P transcriptions can be postprocessed using a phoneme-to-phoneme (P2P) converter. This paper discusses the applicability of P2P t...
متن کاملEfficient Multilingual Phoneme-to-Grapheme Conversion Based on HMM
Grapheme-to-phoneme conversion (GTPC) has been achieved in most European languagesby dictionary look-up or using rules. The application of these methods, however, in the reverse process, (i.e., in phoneme-to-grapheme conversion [PTGC]) creates serious problems, especially in inflectionally rich languages. In this paper the PTGC problem is approached from a completely different point of view. In...
متن کاملTowards multilingual interoperability in automatic speech recognition
In this communication, we address multilingual interoperability aspects in speech recognition. After giving a tentative definition of multilingual interoperability, we discuss speech recognition components and their language-specific aspects. We give a sample overview of past multilingual speech recognition research and development across different speaking styles (read, prepared and conversati...
متن کاملSpoken language processing in a multilingual context
In this paper we overview the spoken language processing activities at LIMSI, which are carried out in a multilingual framework. These activities include speech-to-text conversion, spoken language systems for information retrieval, speaker and language recognition, and speech response. The Spoken Language Processing Group has also been actively involved in corpora development and evaluation. Th...
متن کامل